14 research outputs found

    A versatile programming model for dynamic task scheduling on cluster computers

    This dissertation studies the development of application programs for parallel and distributed computer systems, especially PC clusters. A methodology is proposed to increase the efficiency of code development and the productivity of programmers, and to enhance the performance of the developed programs on PC clusters, while facilitating their scalability and code portability. A new programming model, named the Super-Programming Model (SPM), is created. Programs are developed assuming an instruction set architecture comprised of Super-Instructions (SIs). SPM models the target system as a large Virtual Machine (VM); the VM contains functional units that are realized by member sub-computer systems, and SIs are implemented in software. When these functional units execute SIs, their code runs on member computers to perform the corresponding operations. This approach resembles the design of instruction sets for microprocessors, but the VM employs much coarser instructions and data structures. SIs use Super-Data Blocks (SDBs) as their operands. Each SI is assigned to a single member computer and is indivisible (i.e., its execution is not interrupted for I/O). SIs have predictable execution times because SDB sizes are limited by predefined thresholds; these qualities of SIs facilitate dynamic load balancing. Implementing instructions in software makes this approach more flexible, and the developed programs fit the architectures of cluster systems better. SPM provides mechanisms, such as dynamic load balancing, to ensure the efficient execution of programs; the vast majority of current programming models lack such mechanisms for distributed environments that suffer from long communication latencies. Since SPM employs coarse-grain tasks, the overall management overhead is small. SDB accesses can often overlap the execution of other SIs, and a cache system further decreases average memory latencies.
    Since all SDBs are virtual entities, the runtime system can access them in parallel and efficiently, minimizing the additional constraints on parallelism imposed by the underlying computer systems. In this research, a reference implementation of the VM has been developed, along with a performance estimation model that takes these features into account. Finally, the definition of scalability for parallel/distributed processing is refined to represent a multi-dimensional entity, and sample cases are analyzed.
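The SPM scheduling idea described above can be illustrated with a minimal sketch. This is not the dissertation's actual runtime system; the threshold value, the worker loop, and the toy "SUM" super-instruction are illustrative assumptions standing in for the real SI set and VM:

```python
import queue
import threading

SDB_MAX = 4  # hypothetical threshold bounding Super-Data-Block size

def split_into_sdbs(data, limit=SDB_MAX):
    """Split operand data into Super-Data Blocks no larger than the
    threshold, so each Super-Instruction has a bounded execution time."""
    return [data[i:i + limit] for i in range(0, len(data), limit)]

def member_computer(tasks, results):
    """A worker node: pulls Super-Instructions and runs each to completion
    (an SI is indivisible, so it is never interrupted mid-execution)."""
    while True:
        si = tasks.get()
        if si is None:                 # shutdown signal for this node
            break
        op, sdb = si
        results.append((op, op(sdb)))  # run the SI's code on this node
        tasks.task_done()

tasks, results = queue.Queue(), []
# A toy SI set: one "SUM" Super-Instruction per SDB operand.
for sdb in split_into_sdbs(list(range(10))):
    tasks.put((sum, sdb))

workers = [threading.Thread(target=member_computer, args=(tasks, results))
           for _ in range(3)]
for w in workers:
    w.start()
tasks.join()                           # dynamic load balancing: idle nodes
for _ in workers:                      # simply pull the next available SI
    tasks.put(None)
for w in workers:
    w.join()

print(sum(r for _, r in results))      # totals across all SDBs: 45
```

Because every SI's operand is capped at `SDB_MAX` elements, each task's cost is predictable, which is exactly what makes the pull-based dynamic assignment effective.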

    A Super-Programming Technique for Large Sparse Matrix Multiplication on PC Clusters

    In this paper, we apply our super-programming approach [24] to parallel large matrix multiplication on PC clusters. In our approach, tasks are partitioned into super-instructions that are dynamically assigned to member computer nodes. Thus, the load balancing logic is separated from the computing logic; the former is taken over by the runtime environment. Our super-programming approach facilitates ease of program development and targets high efficiency in dynamic load balancing. Workloads can be balanced effectively and the optimization overhead is small. The results prove the viability of our approach.

    A Super-Programming Technique for Large Sparse Matrix Multiplication on PC Clusters

    The multiplication of large sparse matrices is a basic operation in many scientific and engineering applications. High-performance library routines exist for this operation; they are often optimized for the target architecture. The PC cluster computing paradigm has recently emerged as a viable alternative for high-performance, low-cost computing. In this paper, we apply our super-programming approach [24] to study the load balance and runtime management overhead of implementing parallel large matrix multiplication on PC clusters. In a parallel environment, it is essential to partition the entire operation into tasks and assign them to individual processing elements. Most existing approaches partition the matrices into sub-matrices based on some kind of workload estimation. For dense matrices on some architectures such estimates may be accurate. For sparse matrices on PC clusters, however, the workload of a block operation does not necessarily depend on the size of its data, so workloads may not be well estimated in advance, and any approach other than run-time dynamic partitioning may degrade performance. Moreover, in a heterogeneous environment, static partitioning is NP-complete, and for embedded problems it also introduces management overhead. In this paper we adopt our super-programming approach, which partitions the entire task into medium-grain tasks implemented with super-instructions; the workload of a super-instruction is easy to estimate. These tasks are dynamically assigned to member computer nodes, and a node may execute more than one super-instruction. Our results prove the viability of our approach.
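Why dynamic assignment matters for sparse blocks can be sketched as follows. This is an illustrative toy, not the paper's implementation: the dict-of-coordinates format, the block-row granularity, and the `sparse_matmul_dynamic` helper are all assumptions. Each block-row of the result plays the role of one super-instruction, and nodes pull work instead of receiving a static partition:

```python
import queue
import threading
from collections import defaultdict

def sparse_matmul_dynamic(A, B, n, block=2, workers=2):
    """A and B are sparse matrices as dicts {(i, j): value}. Each
    'super-instruction' computes one block-row of C = A * B; nodes pull
    SIs dynamically, so a block that is dense in nonzeros does not stall
    a statically scheduled neighbour."""
    C = defaultdict(float)
    lock = threading.Lock()
    tasks = queue.Queue()
    for r0 in range(0, n, block):
        tasks.put(r0)                      # one SI per block-row of C

    def node():
        while True:
            try:
                r0 = tasks.get_nowait()
            except queue.Empty:
                return
            local = defaultdict(float)     # compute the SI locally first
            for (i, k), a in A.items():
                if r0 <= i < r0 + block:
                    for (k2, j), b in B.items():
                        if k2 == k:
                            local[(i, j)] += a * b
            with lock:                     # then merge the partial result
                for key, v in local.items():
                    C[key] += v

    nodes = [threading.Thread(target=node) for _ in range(workers)]
    for t in nodes:
        t.start()
    for t in nodes:
        t.join()
    return dict(C)

A = {(0, 0): 1.0, (1, 1): 2.0}
B = {(0, 0): 3.0, (1, 0): 4.0}
C = sparse_matmul_dynamic(A, B, n=2)
# C[(0, 0)] == 3.0 and C[(1, 0)] == 8.0
```

The cost of one SI here depends on the nonzeros that fall inside its block-row, which is exactly the quantity the abstract says cannot be reliably estimated in advance; pulling SIs at run time sidesteps that estimation problem.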

    A super-programming approach for mining association rules in parallel on PC clusters

    PC clusters have become popular in parallel processing. They do not involve specialized inter-processor networks, so the latency of data communications is rather long. The programming models for PC clusters are often different from those for parallel machines or supercomputers containing sophisticated inter-processor communication networks. For PC clusters, load balancing among the nodes becomes a more critical issue in attempts to yield high performance. We introduce a new model for program development on PC clusters, namely the Super-Programming Model (SPM). The workload is modeled as a collection of Super-Instructions (SIs). We propose that a set of SIs be designed for each application domain; they should constitute an orthogonal set of frequently used high-level operations in the corresponding domain. Each SI is normally implemented as a high-level language routine that can execute on any PC. Application programs are modeled as Super-Programs (SPs), which are coded using SIs. SIs are dynamically assigned to available PCs at run time. Because the granularity of SIs is known, an upper bound on their execution time can be estimated statically, so dynamic load balancing becomes an easier task. Our motivation is to support dynamic load balancing and code porting, especially for applications with diverse sets of inputs, such as data mining. We apply SPM here to the implementation of an apriori-like algorithm for mining association rules. Our experiments show that the average idle time per node is kept very low.
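A rough sketch of how candidate counting in an apriori-style pass maps onto SIs is shown below. This is not the paper's algorithm or SI set; the single-pass `count_si` operation, the tiny transaction set, and the `frequent_pairs` helper are illustrative assumptions. Each candidate itemset's support count is one SI, dispatched dynamically:

```python
from itertools import combinations
from queue import Empty, Queue
from threading import Thread

TRANSACTIONS = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"},
                {"b", "c"}, {"a", "b", "c"}]
MIN_SUPPORT = 3

def count_si(candidate, partition):
    """A 'COUNT' Super-Instruction: support of one candidate itemset."""
    return sum(candidate <= t for t in partition)

def frequent_pairs(transactions, min_support, nodes=2):
    # Candidate 2-itemsets from the item universe (apriori-style pass).
    items = sorted(set().union(*transactions))
    tasks = Queue()
    for pair in combinations(items, 2):
        tasks.put(frozenset(pair))         # one SI per candidate
    counts = {}

    def node():
        while True:
            try:
                cand = tasks.get_nowait()  # idle nodes pull the next SI
            except Empty:
                return
            # Each SI writes its own key, so no two nodes collide.
            counts[cand] = count_si(cand, transactions)

    ts = [Thread(target=node) for _ in range(nodes)]
    for t in ts:
        t.start()
    for t in ts:
        t.join()
    return {c for c, n in counts.items() if n >= min_support}

print(frequent_pairs(TRANSACTIONS, MIN_SUPPORT))
# every pair {a,b}, {a,c}, {b,c} occurs in 3 of the 5 transactions
```

Because each COUNT SI scans a bounded data partition, its execution time is predictable, which is what keeps per-node idle time low under dynamic assignment.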

    Control Protocol and Self-adaptive Mechanism for Live Virtual Machine Migration over XIA

    Part 3: Virtualization and Cloud Computing Technologies. FIA (Future Internet Architecture) is a US NSF program for designing the future Internet, and XIA is one of its projects that thoroughly follows the clean-slate concept. Meanwhile, virtual machine migration is a crucial technique in cloud computing, and as a network application, VM migration should also be supported in XIA. This paper is an experimental study that aims to verify the feasibility of VM migration over XIA. We present intra-AD (Administrative Domain) and inter-AD VM migration with KVM instances. The procedure is achieved by a migration control protocol suited to the characteristics of the XIA architecture. Moreover, an elementary self-adaptive mechanism is introduced to maintain VM connectivity and connection states; it is also beneficial for VM migration in TCP/IP networks. Evaluation results show that our solution supports live VM migration in XIA well and that all communications to the VM remain uninterrupted after migration.

    Microstructure and Tensile Properties of the Mg-6Zn-4Al-xSn Die Cast Magnesium Alloy

    The effect of various Sn contents (1–2 wt.%) on the microstructure, age hardening response, and tensile and casting properties of the high-pressure die cast Mg-6Zn-4Al alloy was studied. All as-cast alloys consisted of α-Mg and an icosahedral quasi-crystalline phase; the addition of 2% Sn caused the formation of Mg2Sn phases. The dendrite structure and eutectic phases were observably refined by Sn addition. The hot tearing susceptibility of the die cast Mg-6Zn-4Al alloy decreased prominently with increasing Sn addition. During T6 heat treatment, Sn addition did not obviously affect the time to reach peak hardness, but it significantly enhanced the age hardening response and improved the strength of the alloys under peak-aged conditions. Compared to single aging, double aging resulted in a higher density of finer β1′ and β2′ precipitates. The double-aged Mg-6Zn-4Al-1Sn alloy offered the optimum tensile properties among all conditions: the yield strength, ultimate tensile strength, and elongation were 209 MPa, 305 MPa, and 4.3%, respectively.

    Modeling Distributed Data Representation and its Effect on

    PC clusters have emerged as viable alternatives for high-performance, low-cost computing. In such an environment, sharing data among processes is essential; accessing the shared data, however, may often stall parallel executing threads. We propose a novel data representation scheme in which an application data entity can be incarnated into a set of objects that are distributed in the cluster. The runtime support system manages the incarnated objects, and data access is possible only via an appropriate interface. This distributed data representation facilitates parallel accesses for updates. Thus, tasks are subject to few limitations and application programs can harness high degrees of parallelism. Our PC cluster experiments prove the effectiveness of our approach.
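The incarnation idea above can be sketched in a few lines. This is a minimal single-process analogue, not the paper's runtime system; the `IncarnatedCounter` class, its shard count, and the access interface are all assumptions. One logical entity is split into per-node sub-objects so parallel updates never contend on a single representation:

```python
import threading

class IncarnatedCounter:
    """A shared data entity 'incarnated' into per-node sub-objects.
    Updates go to a node's own shard, so parallel writers do not
    serialize on one object; the merged value is materialized only
    when read through the interface."""

    def __init__(self, nodes):
        self._shards = [0] * nodes                      # one incarnation per node
        self._locks = [threading.Lock() for _ in range(nodes)]

    def add(self, node_id, amount):
        with self._locks[node_id]:                      # contention stays shard-local
            self._shards[node_id] += amount

    def value(self):
        return sum(self._shards)                        # merge incarnations on read

counter = IncarnatedCounter(nodes=4)

def worker(node_id):
    for _ in range(1000):
        counter.add(node_id, 1)

threads = [threading.Thread(target=worker, args=(i,)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value())  # 4000
```

Forcing all access through `add`/`value` is the point of the interface requirement in the abstract: callers never see which incarnation they touched, so the runtime is free to distribute the sub-objects across the cluster.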